

# Design of Synchronous Binary Counter with Backward Carry Propagation

Vamshi Krishna Madam, Chand Shabana Mohammad, Shivanjali Boja

ECE, Sreenidhi Institute of Science and Technology (JNTU), Hyderabad, Telangana, India

Date of Submission: 01-01-2023

Date of Acceptance: 10-01-2023

ABSTRACT: A synchronous binary counter is one of the basic components widely used in VLSI design, and it is required to be fast and to support a wide bit range in many applications. However, most previous counters are limited in counting rate by large fan-outs and long carry chains, particularly when the counter size is not small. This paper proposes a new fast structure for synchronous binary counting that achieves a minimum counting period for practical counter sizes ranging from 8 to 128 bits. We first adopt a 1-bit Johnson counter to reduce the overall hardware complexity, and then duplicate the 1-bit Johnson counter to reduce the propagation delay caused by large fan-outs. Implementation results show that the proposed design can be realized with a small number of flip-flops, which is almost linear in the counter size, and that it operates at a clock frequency of 2 GHz in a 65nm CMOS technology, being limited only by the counting rate of the least significant bit.

Keywords: Backward carry propagation, binary counter, constant-time counter

## **I. INTRODUCTION**

The counter is one of the basic components widely used in many applications, such as measurement systems, analog-to-digital converters, frequency dividers, and phase-locked-loop frequency synthesizers. Due to recent advances in these applications, it is generally required to implement a fast, wide counter that supports a constant counting rate independent of the counter size. However, the counting rate and the size conflict with each other, because the carry propagation from a low-order bit to a high-order bit becomes longer as the counter size gets larger. Asynchronous counters, sometimes called ripple counters, can be realized with a small number of logic gates, but the accumulated delay caused by the ripple propagation produces false outputs for a short period of time because the flip-flops (F/Fs) are connected to different clock signals. The ripple effect becomes worse when the counter size is large and can be critical for applications in which the count value needs to be stable.

To obtain a stable binary output, a synchronous binary counter can be used. The simplest synchronous counter is the ripple carry counter, in which the carry-out of a one-bit adder is connected to the carry-in of the succeeding stage. The chain of carry signals is called a ripple carry chain, as the carry signal is successively rippled into the next stage. The main factor limiting the speed of a synchronous counter is the long carry propagation caused by the carry chain. Many techniques have been developed for fast adders, and these have also been applied to derive fast counters [1]. The ripple carry chain in the traditional binary counter was replaced with a carry-lookahead circuit in order to attain a significant speedup [2]. In addition, a hierarchical Manchester carry chain was used for carry propagation in [3], and a state-lookahead topology was used in [4] to break the carry chain by adding D F/Fs, avoiding the rippling. In [5], the carry chain was constructed using a tree structure. However, regarding a counter as a combination of an adder and a state register is not effective in achieving a constant clock period, since the lower bound of the adder delay is not constant. There have been other efforts to speed up the counter by improving the F/F. For instance, high-speed synchronous counters were developed by using F/Fs based on the true single-phase clock [2], [6].

However, if only fast synchronous counting is required rather than a binary sequence, a counter with a constant clock period can be achieved by employing a state generator. For instance, a pipelined carry propagation chain based on systolic structures was presented in [7], [8], but it doubles the number of F/Fs required as well as the overall hardware complexity. Another approach to realizing a state generator is to use a linear-feedback shift register (LFSR) [3], [9], but it requires large additional circuits to convert the state sequence to a binary value and to make the number of states a power of two.



To achieve both a constant delay and a binary sequence, another carry propagation scheme called backward carry propagation was presented in [10]. It exploits the property of a binary sequence that the more significant bits go high earlier than the less significant bits. This technique can be applied to obtain a constant-delay counter, since the carry propagation is determined only by the least significant bit (LSB). However, the LSB has to drive all F/Fs of the counter, leading to a large fan-out problem. In other words, the number of input ports connected to the LSB exceeds the maximum number that the LSB can drive. In addition, another synchronous binary counter based on prescaling was presented in [11]. A wide counter is partitioned into subblocks. The high-order block is enabled by a prescaled enable (PEN) signal generated from the low-order block, and the clock period of a prescaled counter is determined by the least significant block. However, there are still issues related to fan-out: a PEN signal must be widely distributed to drive a large number of write enable inputs of the F/Fs in the following block. The large fan-out is in fact the key problem that must be resolved in order to implement a fast binary counter. The fan-out problem gets worse as the counter size grows, lengthening the propagation delay.

In this paper, we describe a synchronous binary counter with a constant delay that can handle practical counter sizes up to 128 bits. In the proposed counter, the 1-bit Johnson counter is duplicated to address the large fan-out problem, and the backward carry propagation technique is used to eliminate the extra delay of ripple carry propagation. The proposed counter achieves the maximum counting rate, and regardless of the counter size, the counting rate is determined only by the least significant 1-bit counter.

## **II. PREVIOUS COUNTERS**

This section describes in detail several earlier works whose ideas are relevant to the proposed counter. The conventional synchronous binary counter shown in Fig. 1 is composed of T F/Fs, each containing an XOR gate. Note that an XOR is the minimal logic required to change a counter bit. For simplicity, the combination of an XOR gate and a D F/F is treated as a T F/F in this work. The chain of AND gates examines all the lower bits in sequence, starting from the LSB, and produces a carry when the lower bits are all 1s. This long path of serially connected AND gates is the ripple carry chain, which propagates the carry from the least significant bit (LSB) to the most significant bit (MSB). The output of an AND gate is valid only after all of the lower carry signals have stabilised, so the propagation path lengthens as the counter size grows. As a result, the delay of the conventional binary counter increases linearly with the counter size.
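The behaviour of the ripple carry chain can be illustrated with a small functional model. The sketch below is only an illustration under stated assumptions: the names `ripple_carry_step`, `to_int`, and `and_chain_depth` are hypothetical, the count enable input is assumed to feed the start of the chain, and gate delays are represented only by counting AND gates in series.

```python
def ripple_carry_step(q, enable=1):
    """Advance a conventional synchronous counter by one clock.

    q is a list of bits with q[0] as the LSB. The carry starts at the
    (assumed) count enable input and ripples through one AND gate per bit.
    """
    carry = enable
    next_q = []
    for bit in q:
        next_q.append(bit ^ carry)  # T F/F: the XOR toggles the bit when carry is 1
        carry = carry & bit         # one more AND gate in series
    return next_q


def to_int(q):
    """Convert an LSB-first bit list to an integer."""
    return sum(bit << k for k, bit in enumerate(q))


def and_chain_depth(width):
    """Number of serial AND gates in front of the MSB toggle input."""
    return width - 1


if __name__ == "__main__":
    q = [0] * 4
    for _ in range(5):
        q = ripple_carry_step(q)
    print(to_int(q), and_chain_depth(len(q)))
```

The chain depth grows by one AND gate per added bit, which is the linear delay growth described above.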

Implementing a fast synchronous counter requires understanding the backward carry propagation idea first introduced in [10]. It relies on a property of the binary number system: a more significant bit of the counter goes high earlier than a less significant bit. As shown in Fig. 2, each counter bit has its own AND chain connected in the reverse direction, as opposed to the single chain used in the conventional binary counter. Within a carry chain, the early-arriving signals are evaluated first and the late-arriving signals afterwards. The final AND gate in each chain is connected to the fast-changing signal coming from the LSB, so it determines the critical path delay of backward carry propagation. As a result, the propagation delay is dominated by the delay of the final AND gate and a T F/F. However, as seen in Fig. 2, the LSB, Q[0] in this figure, is connected to every AND chain and therefore has a large fan-out. In other words, the load on the LSB is too heavy to be driven quickly, so the critical path delay depends on the fan-out.
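A functional sketch may help to see why only the last AND gate matters. The code below is an illustrative model, not the circuit of [10]: the function names are hypothetical, and timing is not simulated. Instead, it checks the property that makes the scheme work, namely that the part of each chain that excludes Q[0] never changes on a clock edge where Q[0] rises, so it has already settled when the LSB arrives.

```python
def backward_chain(q, i):
    """Return (early, toggle) for bit i of an LSB-first bit list q.

    'early' is the AND of q[i-1] down to q[1], evaluated from the
    high-order side first; 'toggle' adds the final AND with the LSB.
    """
    early = 1
    for j in range(i - 1, 0, -1):
        early &= q[j]
    toggle = (early & q[0]) if i > 0 else 1  # the LSB itself always toggles
    return early, toggle


def backward_step(q):
    """Advance the counter by one clock using one backward chain per bit."""
    return [q[i] ^ backward_chain(q, i)[1] for i in range(len(q))]


if __name__ == "__main__":
    q = [0] * 8
    for _ in range(10):
        q = backward_step(q)
    print(sum(bit << k for k, bit in enumerate(q)))
```

In this model Q[0] appears in the chain of every higher bit, which corresponds to the large fan-out on the LSB noted above.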

Fig. 3 shows the constant-delay binary counter based on prescaling that was introduced in [11]. When a large counter is divided into smaller subblocks, the high-order subblock runs at a lower frequency than the low-order subblock. The main idea is to increment a high-order block according to the PEN signal produced by a prescaler, which is a low-order block. The frequency of a PEN signal is much lower than that of the clock signal. This works because the high-order block is incremented far less frequently than the low-order block. Using ring counters to produce the PEN signals nearly doubles the hardware complexity. In theory, the clock period is determined by the least significant block, that is, by the delay of an XOR gate, the loading delay of a D F/F, and the setup time of a D F/F. In practice, however, the fan-out required to distribute the count enable signal (CNT) and the PEN significantly reduces the counting rate, because these signals have to drive a large number of F/Fs in the ring counters and the next subblock. For instance, the CNT in Fig. 3 enables the shift operation of three ring counters with a total of 74 D F/Fs, and the PEN output of the 64-bit ring counter drives 58 D F/Fs in the next 58-bit subblock. The fan-out problem becomes more serious as the counter size increases, and the propagation delay of the PEN signal eventually exceeds the intended minimum propagation delay. The actual delay, which is lengthened by the large fan-out nodes, is therefore very different from the theoretical one.





Fig. 2. 8-bit synchronous binary counter designed with backward carry propagation.



Fig. 3. 64-bit prescaled counter that generates prescaled enable signals with

The proposed method reduces the fan-out delay and the hardware complexity by duplicating redundant 1-bit Johnson counters and by using backward carry propagation to generate the PEN signal. The proposed counter thus achieves a high counting rate for practical counter sizes.



Fig. 4. Detailed structure of the proposed N-bit counter.

## **III. PROPOSED COUNTER**

Fig. 4 depicts the proposed N-bit counter. For simplicity, let $n = \lceil \log_2 N \rceil$ and $m = \lceil (N - n)/L \rceil$, where L is the maximum fan-out, a value determined by running simulations. To benefit from prescaling, an N-bit counter is divided into three subcounters, and m 1-bit Johnson counters are used to produce m PEN signals for the last subcounter. Each Johnson counter is initialised to 0, and when it switches from 0 to 1, the PEN signal is produced to enable counting in the following subcounter.
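As a quick check of these definitions, the following sketch computes n, m, and the subcounter widths for a given counter size. The function name `counter_parameters` is hypothetical, and the default L = 16 is taken from the fan-out limit quoted later in the text; the ceiling operations are done in integer arithmetic to avoid floating-point rounding.

```python
def counter_parameters(N, L=16):
    """Return n, m, and the widths of C1, C2, C3 for an N-bit counter.

    n = ceil(log2(N)) and m = ceil((N - n) / L).
    L is the maximum fan-out (assumed to be 16 here).
    """
    n = (N - 1).bit_length()  # equals ceil(log2(N)) for N >= 1
    m = -(-(N - n) // L)      # ceiling division
    widths = {"C1": 1, "C2": n - 1, "C3": N - n}
    return n, m, widths


if __name__ == "__main__":
    for size in (8, 16, 32, 64, 128):
        print(size, counter_parameters(size))
```

For N = 64 this gives n = 6 and m = 4, with a 1-bit C1, a 5-bit C2, and a 58-bit C3.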

#### A. Counter Block

A counter block acts as a sequence generator, counting from 00...000 to 11...111. A counter typically consists of a register, which stores the current state, and a combinational incrementer, which calculates the next value. The counting rate is limited mainly by the computation time of the incrementer, and prescaling can help reduce this delay. In the proposed architecture, an N-bit counter is divided into three subcounters C1, C2, and C3, as illustrated in Fig. 4. The 1-bit subcounter C1 alternates between 0 and 1 every clock cycle. Subcounter C2 is an (n-1)-bit counter that operates with backward carry propagation. The final subcounter, C3, is an (N-n)-bit conventional binary counter.

The fundamental idea of the partitioned counter is to prescale the high-order block according to the low-order block. An N-bit counter is split into three subcounters so that the propagation delay of the (N-n)-bit synchronous ripple carry binary counter C3, whose carry chain is made up of (N-n-1) AND gates, is shorter than the period of PEN2 produced in C2.

Since the period of PEN2 is $2^n$ clock cycles, the carry propagation in C3 can settle before the next PEN2 leaves C2 [11]. Subcounter C2, an (n-1)-bit backward carry propagation counter, is in turn enabled by the 1-bit counter C1. Because backward carry propagation reduces the delay of the long carry chain to that of a single AND gate, the carry propagation of C2 is guaranteed to be faster than the period of PEN1 generated in C1. As a result, only three subcounters are needed, and unlike [11], the partitioning is not applied to the subcounters recursively. The carry propagation delay of C2 is the sum of the delays of an AND gate and an XOR gate and the loading delay of a D F/F. Note that the fan-out of the first bit has a negligible effect because C2 has only (n-1) bits. The setup time of a D F/F is also taken into account when determining the minimum clock period. Furthermore, the period of PEN1 generated in C1 is 2 clock cycles, and the carry propagation delay of C2 is always shorter than this period. Consequently, the clock period is determined by the least significant subcounter C1.
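The partitioning can be checked with a purely functional model. The sketch below is an assumption-laden illustration rather than the authors' circuit: each subcounter is held as an integer, PEN1 is taken to be the output of C1, PEN2 is taken to be high when the low n bits are all 1s, and the names `partitioned_step` and `combined_value` are hypothetical.

```python
def partitioned_step(c1, c2, c3, n, N):
    """Advance the three subcounters C1, C2, C3 by one clock."""
    pen1 = c1                                           # high every second cycle
    pen2 = int(c1 == 1 and c2 == (1 << (n - 1)) - 1)    # low n bits are all 1s
    next_c1 = c1 ^ 1                                    # C1 toggles every clock
    next_c2 = (c2 + pen1) % (1 << (n - 1))              # C2 counts when PEN1 is 1
    next_c3 = (c3 + pen2) % (1 << (N - n))              # C3 counts when PEN2 is 1
    return next_c1, next_c2, next_c3


def combined_value(c1, c2, c3, n):
    """Concatenate C3, C2, C1 into the N-bit count value."""
    return c1 | (c2 << 1) | (c3 << n)


if __name__ == "__main__":
    N, n = 10, 4
    state = (0, 0, 0)
    for _ in range(20):
        state = partitioned_step(*state, n, N)
    print(combined_value(*state, n))
```

In this model C3 is incremented only once every $2^n$ clocks, which is the slack that lets its ripple carry chain settle.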



#### B. Generating Prescaled Enable Signal

In the prescaled counter, the PEN signal should be synchronised with the clock, and the effect of fan-out on its delay should be minimal. A ring counter or a twisted-tail counter is typically used to create the PEN [11]. In the ring counter, the output of the last F/F is connected to the input of the first F/F, forming a circle. The PEN signal changes to 1 when the n-bit ring counter reaches a value of $2^{n-1}$. Similarly, the n-bit twisted-tail counter, or Johnson counter, which connects the inverted output of the last F/F to the input of the first F/F, activates the PEN signal when the count value reaches $2^{n-1}$. Since there is no combinational circuit between neighbouring F/Fs, the PEN is synchronous with the clock and these counters can operate at a high frequency. The approach is inefficient, however, because N F/Fs are required to step through N states, which adds to the hardware complexity. Furthermore, the PEN signal must drive every F/F in the following partition, resulting in a large fan-out, a longer propagation delay, and a slower overall counting speed. A 64-bit counter, for instance, might be constructed from a 1-bit subcounter C1, a 5-bit subcounter C2, a 58-bit subcounter C3, a 2-bit ring counter generating PEN1, and a 64-bit ring counter generating PEN2. The fan-out of PEN1 is small enough to be neglected. However, because PEN2 drives the 58 enable ports of C3, the delay caused by this large fan-out must be taken into account when designing the circuit.
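The ring-counter PEN generator can be sketched as a one-hot shift register. This is a minimal illustration with hypothetical names (`ring_step`, `ring_pen`), assuming the ring is initialised with a single 1 in the first F/F and that the PEN is taken from the last F/F.

```python
def ring_step(state):
    """Shift a one-hot ring counter by one position.

    The output of the last F/F feeds the input of the first F/F.
    """
    return [state[-1]] + state[:-1]


def ring_pen(state):
    """PEN is the output of the last F/F in the ring."""
    return state[-1]


if __name__ == "__main__":
    ring = [1] + [0] * 7  # 8-stage ring, assumed initial state
    pens = []
    for _ in range(16):
        pens.append(ring_pen(ring))
        ring = ring_step(ring)
    print(pens)
```

An 8-stage ring needs 8 F/Fs to produce one pulse every 8 cycles, which illustrates why a ring whose length matches the PEN period becomes expensive for long periods.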







As will be seen in the simulation and implementation results later, the propagation delay of PEN2 forms the critical path and prevents the counter from reaching the minimum clock period. To address this fan-out problem, the $2^n$-bit ring counter is replaced with a 1-bit Johnson counter. Fig. 5 shows the 5-bit backward carry propagation counter and the PEN generator for N = 64, n = 6, and m = 4. The 1-bit Johnson counter is initialised to 0 and toggles its state whenever it is enabled. Our objective is to make PEN2 pulse once every $2^n$ cycles, so that it is 1 at the $(2^n - 1)$-th cycle, which is the 63rd cycle in the example. Such a signal can be produced with the backward carry propagation technique, as shown in Fig. 6. The AND of Q[5], Q[4], Q[3], and Q[2] is realised with backward AND chains.

The Q[5]&Q[4]&Q[3]&Q[2] signal goes high when Q[2] changes from low to high, and it stays high for four cycles. To make the output of the AND chain high for only two cycles, the last AND gate is connected to the late-arriving signal Q[1]. Because &Q[5:2] has already been computed thanks to backward carry propagation, the enable signal, which equals &Q[5:1], requires only one additional AND gate. The enable signal repeats every 64 cycles and is high at cycles 62 and 63.

As a result, PEN2 pulses once every 64 cycles. In other words, PEN2 is equal to &Q[5:0], which is calculated from the lowest 6 bits of the counter.
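The timing described for N = 64 can be reproduced with a short cycle-by-cycle model. The sketch below makes simplifying assumptions: the low n bits Q[n-1:0] are taken directly from the cycle index, the 1-bit Johnson counter is modelled as a F/F that toggles when enabled, and the function name `pen2_trace` is hypothetical.

```python
def pen2_trace(cycles, n=6):
    """Return a list of (early, enable, pen2) tuples, one per clock cycle.

    early  = &Q[n-1:2], computed first by the backward AND chain
    enable = &Q[n-1:1], one more AND gate with the late signal Q[1]
    pen2   = output of the 1-bit Johnson counter (initialised to 0)
    """
    johnson = 0
    trace = []
    for t in range(cycles):
        q = [(t >> k) & 1 for k in range(n)]  # low n bits of the count
        early = 1
        for k in range(n - 1, 1, -1):         # Q[5], Q[4], Q[3], Q[2] for n = 6
            early &= q[k]
        enable = early & q[1]
        trace.append((early, enable, johnson))
        johnson ^= enable                     # toggles at the clock edge when enabled
    return trace


if __name__ == "__main__":
    for t, row in enumerate(pen2_trace(64)):
        if any(row):
            print(t, row)
```

In this model the enable is high at cycles 62 and 63, so the Johnson counter output rises at cycle 63 and falls again at cycle 64, giving a one-cycle pulse equal to &Q[5:0].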

As shown in Fig. 5, the 1-bit Johnson counter can be duplicated to handle large fan-out nodes. The number of redundant Johnson counters is $m = \lceil (N - n)/L \rceil$, where L is the maximum number of input ports that an F/F can drive. For N from 8 to 128 bits, m is no more than 8. Since at most 8 redundant F/Fs are needed, the complexity added by the redundancy is only a small part of the total counter. Based on numerous simulations, the maximum fan-out of a Johnson counter is set to 16 so as not to lower the counting rate. All of the Johnson counters are driven by the same signal, so m identical PEN2 signals are produced. The PEN2 signals are distributed evenly, each driving up to L F/Fs of the following subcounter C3.
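One way to picture the even distribution is to compute how many C3 F/Fs each PEN2 copy would drive. This is a sketch under assumptions: the function name `pen2_loads` is hypothetical, L = 16 is the fan-out limit quoted above, and the balanced split is one possible reading of "distributed evenly".

```python
def pen2_loads(N, L=16):
    """Return the number of C3 F/Fs driven by each of the m PEN2 copies."""
    n = (N - 1).bit_length()       # ceil(log2(N))
    width = N - n                  # number of F/Fs in subcounter C3
    m = -(-width // L)             # ceil(width / L) Johnson counters
    base, extra = divmod(width, m)
    # The first 'extra' copies drive one more F/F than the others.
    return [base + 1 if j < extra else base for j in range(m)]


if __name__ == "__main__":
    for size in (8, 16, 32, 64, 128):
        print(size, pen2_loads(size))
```

For N = 64 the 58 enable ports of C3 are split into loads of 15, 15, 14, and 14, all within the limit of 16.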

## **IV. COMPARISON AND IMPLEMENTATION**

In this section, the performance of the proposed counter is compared with that of three other counters: the conventional binary counter, the backward carry propagation counter [10], and the prescaled counter using ring counters [11].





Fig. 7. (a) Delay of prescaled enable signal 2. (b) Maximum counting rate



Fig. 8. (a) Combinational equivalent gate counts and flip-flops. (b) Total equivalent gate counts.

The hardware complexity and the maximum clock frequency were investigated for counter sizes ranging from 8 to 128 bits. The setup time of a D F/F is taken into account when determining the maximum clock frequency, but not when measuring the propagation delay. The counters were implemented with a 65nm standard cell library, analysed with Synopsys Design Compiler, and simulated with the parasitic resistors and capacitors extracted from the layout. The prescaled counter has two critical paths. One is found in the 1-bit counter block and consists of an XOR gate and a D F/F; the other is related to the propagation delay of PEN2. Fig. 7(a) shows the propagation delay of PEN2. The delay of [11] keeps increasing linearly with the size of C3 and exceeds the upper bound of the permitted delay, which is the sum of an XOR delay and an F/F delay. In the proposed counter, the fan-out is limited to at most 16 thanks to the redundant PEN2 signals, so its delay is always below the upper bound.

In [11], therefore, the actual critical path is the PEN2 propagation with its large fan-out.

In the proposed counter, in contrast, PEN2 has an almost constant delay and is not on the critical path, so the total delay is determined solely by the XOR gate in the rightmost 1-bit counter.

The large fan-out node is the fundamental limitation of the counter structures in [10] and [11]. In [11], the maximum frequency of 2 GHz can only be reached for small counters of up to 15 bits; as the counter size grows, the large fan-out slows down the counting rate. The proposed counter has a nearly constant counting frequency of 2 GHz that is almost independent of the counter size up to 128 bits.

Fig. 8 shows the equivalent gate (EG) counts needed for each counter. One EG corresponds to a 2-input NAND gate, while an XOR gate and a D F/F are counted as 2.25 EGs and 7.5 EGs, respectively, according to the standard cell library used in the experiments. The combinational gate counts rise linearly with the counter size in the conventional counter and the proposed counter, whereas they increase exponentially with the counter size in [10]. Since the size of the ring counter used to generate PEN2 is comparable to the counter size, the total number of F/Fs in [11] is almost twice the counter size.
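To give a feel for the flip-flop part of this comparison, the sketch below converts F/F counts into EGs for N = 64. It is a rough illustration only and does not reproduce Fig. 8: it counts F/Fs but no combinational gates, it assumes the proposed counter uses N counter F/Fs plus m 1-bit Johnson counters with L = 16, and it assumes the ring-counter design of [11] uses 64 counter F/Fs plus the 74 ring-counter F/Fs quoted earlier for Fig. 3. All function names are hypothetical.

```python
EG_PER_DFF = 7.5  # from the text: one D F/F is counted as 7.5 EGs


def flip_flop_eg(num_flip_flops):
    """Equivalent gates contributed by the flip-flops alone."""
    return num_flip_flops * EG_PER_DFF


def proposed_flip_flops(N, L=16):
    """Assumed F/F count of the proposed counter: N bits plus m Johnson counters."""
    n = (N - 1).bit_length()
    m = -(-(N - n) // L)
    return N + m


def ring_prescaled_flip_flops_64():
    """Assumed F/F count of the 64-bit ring-counter design: 64 + 74."""
    return 64 + 74


if __name__ == "__main__":
    print(flip_flop_eg(proposed_flip_flops(64)))
    print(flip_flop_eg(ring_prescaled_flip_flops_64()))
```

Under these assumptions the ring-counter design needs roughly twice as many F/Fs as the proposed one at 64 bits, in line with the statement above.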

## **V. CONCLUSION**

In this paper, we have proposed a new synchronous binary counter design whose delay is nearly constant for practical counter sizes. By using backward carry propagation and redundant 1-bit Johnson counters, the proposed design reduces both the number of flip-flops and the undesired propagation delay caused by large fan-out nodes.

The proposed counter operates at 2 GHz in a 65nm CMOS technology, almost independently of the counter size, and can be implemented with a modest number of flip-flops, only slightly more than the counter size.

## ACKNOWLEDGEMENT

The authors thank the IC Design Education Center (IDEC), Korea, for supporting the EDA tool.

## REFERENCES

- [1] M. R. Stan, A. F. Tenca, and M. D. Ercegovac, "Long and fast up/down counters," in IEEE Transactions on Computers, vol. 47, no. 7, pp. 722-735, 1998.
- [2] J. Yuan, "Efficient CMOS counter circuits," in Electronics Letters, vol. 24, no. 21, pp. 1311-1313, 13 Oct. 1988, doi: 10.1049/el:19880891.



- [3] M. Kondo and T. Watnaba, "Synchronous Counter," U.S. Patent no. 5,526,393, June 1996.
- [4] M. Ercegovac and T. Lang, "Binary counter with counting period of one half adder independent of counter size," in IEEE Transactions on Circuits and Systems, vol. 36, no. 6.
- [5] P. Larsson and J. Yuan, "Novel carry propagation in high-speed synchronous counters and dividers," in Electronics Letters, vol. 29, no. 16, pp. 1457-1458, 5 Aug. 1993, doi: 10.1049/el:19930975.